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Abstract The problem of statistical recognition is considered, as it arises in immunobiology, 
namely, the discrimination of foreign antigens against a background of the body's own molecules. 
The precise mechanism of this foreign-self-distinction, though one of the major tasks of the immune 
system, continues to be a fundamental puzzle. Recent progress has been made by van den Berg, 
Rand, and Burroughs [33], who modelled the probabilistic nature of the interaction between the 
relevant cell types, namely, T-cells and antigen-presenting cells (APCs). Here, the stochasticity is 
due to the random sample of antigens present on the surface of every APC, and to the random 
receptor type that characterises individual T-cells. It has been shown previously [33ll37] that this 
model, though highly ideaHsed, is capable of reproducing important aspects of the recognition 
phenomenon, and of explaining them on the basis of stochastic rare events. These results were 
obtained with the help of a refined large deviation theorem and were thus asymptotic in nature. 
Simulations have, so far, been restricted to the straightforward simple sampHng approach, which 
does not allow for sample sizes large enough to address more detailed questions. Building on the 
available large deviation results, we develop an importance sampling technique that allows for 
a convenient exploration of the relevant tail events by means of simulation. With its help, we 
investigate the mechanism of statistical recognition in some depth. In particular, we illustrate how 
a foreign antigen can stand out against the self background if it is present in sufficiently many 
copies, although no a priori difference between self and nonself is built into the model. 
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1 Introduction 

The notion of statistical recognition between randomly encountered molecules is central to many 
biological phenomena. This is particularly evident in biological repertoires, which contain enough 
molecular diversity to bind practically any randomly encountered target molecule. The recep- 
tor repertoire of the immune system provides the best-known example of a system displaying 
probability-based interactions; another one is the olfactory receptor repertoire, which recognises 
multitudes of odorants. This chance recognition is a well-established phenomenon and has been 
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analysed with the help of various statistical and biophysical models; compare [r7|l23j . Here we 
will tackle a model of statistical recognition between cell surfaces (in the sense of collections of 
numerous surface molecules, rather than single ones) of the immune system. It describes a vital 
property of our immune system, which comes into play when a virus invades the body and starts 
to multiply. Fortunately, however, sooner or later it is recognised as a foreign intruder by certain 
white blood cells, which are part of the immune system and start a specific immune response that 
finally eliminates the virus population. 

This ability of the immune system to discriminate safely between foreign and self molecules is 
a fundamental ingredient to everyday survival of jawed vertebrates; but how this works exactly is 
still enigmatic. Indeed, the immune system faces an enormous challenge because it must recognise 
one (or a few) type(s) of (potentially dangerous) foreign molecules against an enormous variety 
of (harmless) self molecules. The particular difficulty Hes in the fact that there can be no a priori 
difference between self and nonself (like some fundamental difference in molecular structure) , since 
this would open up the possibility for molecular mimicry on the part of the pathogen, which could 
quickly evolve immuno-invisibility by imitating the self structure. The problem may be phrased as 
statistical recognition of one particular foreign signal against a large, fiuctuating self background. 
However, immune biology has been largely treated deterministically until, recently, an explicit 
stochastic model was introduced by van den Berg, Rand and Burroughs [33] (henceforth referred 
to as BRB) and further developed by Zint, Baake and den Hollander ^7\. It describes (random) 
encounters between the two crucial types of white blood cells involved (see Figs. [T] and [21): the 
antigen-presenting cells (APCs), which display a mixture of self and foreign antigens at their 
surface (a sample of the molecules around in the body), and the T-cells, which "scan" the APCs 
by means of certain receptors and ultimately decide whether or not to react, i.e., to start an 
immune response. 
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Fig. 1 A T-cell and an antigen-presenting cell (based on Fig. 1 of |35|1. An APC absorbs molecules 
and particles from its vicinity and breaks them down. The emerging fragments, so-called peptides (short 
sequences of amino acids), serve as antigens. They are bound to so-called MHC molecules (still within the 
cell), and the resulting complexes, each composed of an MHC molecule and a peptide, are presented on 
the surface of the cell (the MHC molecules serve as carriers or anchors to the cell surface). Since most of 
the molecules in the vicinity of an APC are self molecules, every APC displays a large variety of different 
types of self antigens and, possibly, one (or a small number of) foreign types. The various antigen types 
occur in various copy numbers. Each T-cell is characterised by a specific type of T-cell receptor (TCR), 
which is displayed in many identical copies on the surface of the particular T-cell. When a T-cell meets 
an APC, the contact between them is established by a temporary bond between the cells, in which the 
TCRs and the MHC-peptide complexes interact with each other, which results in stimuli to the T-cell 
body. If the added stimulation rate is above a given threshold, the T-cell is activated to reproduce, and 
the resulting clones of T-cells will initiate an immune reaction against the intruder. 
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To be biologically more precise, we consider the encounters of so-called naive T-cells with 
professional APCs in the secondary lymphoid tissue. A naive T-cell is a cell that has finished its 
maturation process in the thymus and has been released into the body, where it has not yet been 
exposed to antigen. It tends to dwell in secondary lymphoid tissue like lymph nodes, where it comes 
into contact with professional APCs, special white blood cells with so-called MHC molecules at 
their surface that serve as carriers for antigens. Each T-cell is characterised by a specific type of 
T-cell receptor (TCR) , which is displayed in many identical copies on the surface of the particular 
T-cell. A large number (estimated at 10'' in [l]) of different receptors, and hence different T-cell 
types, are present in an individual (every type, in turn, is present in several copies, which form a 
T-cell clone). However, the number of potential antigen types is still vastly larger (roughly 10^'^; 
see [20]). Thus, specific recognition (where one TCR recognises exactly one antigen) is impossible; 
this is known as Mason's paradox. The task is further complicated by the fact that every APC 
displays on the order of thousand (s) of different self antigen types, in various copy numbers [HI 
[201128) ■ together with, possibly, one (or a small number of) foreign types; the T-cells therefore face 
a literal "needle in a haystack" problem. 

For an encounter between a pair of T-cell and APC, both chosen randomly from the diverse 
pool of T-cells and APCs, the probability to react must be very small (otherwise, immune reactions 
would occur permanently); this is a central theme in the analysis. It entails that some questions 
may be answered analytically with the help of large deviation theory; others require simulation, 
but the use of this has been limited due to the small probabilities involved, at least with the 
straightforward simulation methods applied so far [33|l37j . The main purpose of this article is to 
devise an efficient importance sampling method based on large deviation theory and tailored to 
the problem at hand, and to use this to investigate the mechanism of statistical recognition in 
more detail. The paper is organised as follows. In Sect. 2, we present the most important biological 
facts and recapitulate the model; this will be a self-contained, but highly simplified outline, since 
the full picture is available elsewhere |33P37). In Sect. 3, we summarise (mainly from [9] and [5]) 
some general theory that allows to design efficient methods of rare event simulation on the basis 
of a large deviation analysis, and tailor these to the problem at hand in Sect. 4. Sect. 5 presents 
the simulation results and analyses them both from the computational and the biological point of 
view. Simulation speeds up by a factor of nearly 1500 relative to the straightforward approaches 
used so far. This enables us to explore regions of parameter space as yet inaccessible, to validate 
previous asymptotic results, and to investigate the mechanism of statistical recognition in more 
depth than previously possible. 

2 The T-cell model 

In this Section, we briefiy motivate and introduce the model of T-cell recognition as first proposed 
by BRB in 2001 [33] and further developed by Zint, Baake and den Hollander [37]. More precisely, 
we only consider the toy version of this model, which neglects the modification of the T-cell 
repertoire during maturation in the thymus. This toy version already captures important aspects 
of the phenomenon while being particularly transparent. We will come back to maturation (already 
included in [33]) in the discussion. 

When T-cells and APCs meet, the T-cell receptors bind to the various antigens presented by the 
APC [6]. For every single receptor-antigen pair, there is an association-dissociation reaction, the 
rate constants for which depend on the match of the molecular structures of receptor and antigen. 
Assuming that association is much faster than dissociation and that there is an abundance of 
receptors (so that the antigens are mostly in the bound state), one can describe the reaction in 
terms of the dissociation rates only. 

Every time a receptor unbinds from an antigen, it sends a signal to the T-cell, provided the 
association has lasted for at least one time unit (i.e., we rescale time so that the unit of time is 
this minimal association time required) . The duration of a binding of a given receptor-antigen pair 
follows the Exp(1/t) distribution, i.e. the exponential distribution with mean t, where r is the 
inverse dissociation rate of the pair in question. The rate of stimuli induced by the interaction of 



Fig. 2 Caricature of T-cells and APCs (from [37)1. Note that every T-cell has many copies of one particular 
receptor type, but different T-cells have different receptor types. In contrast, every APC carries a mixture 
of antigen types, which may appear in various copy numbers. 

our antigen with the receptors in its vicinity is then given by 

w(t) = -exp(--), (1) 
r T 

i.e., the dissociation rate times the probability that the association has lasted long enough. (If 
the simplifying assumption of unlimited receptor abundance is dispensed with, Eq. ^ must be 
modified, see |34j.) As shown in Fig. [Sj the function w first increases and then decreases with r 
with a maximum at r = 1, which refiects the fact that, for r < 1, the bindings tend not to last 
long enough, whereas for t > 1, they tend to last so long that only few stimuli are expected per 
time unit. 
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Fig. 3 Left: the function w. Right: the densities of = w{T) and with tilting parameter i9 = 46 
(cf. Sect. l3.2( l. The densities have poles at w{Q) = and io(l) = 0.3679 (due to the vanishing derivative of 
ui at r = and r = 1), but the right poles are invisible because they support very little probability mass. 
In fact, for e = 0.01, one has P(0 < < e) = 0.98 and P(w(l) - e <W < w{l)) = 2.17 • 10-^ whereas 
P(0 < < e) = 0.137138 and P(w(l) -e<W'^ < w{l)) = 0.0050. 



The T-cell sums up the signals induced by the different antigens on the APC, and if the total 
stimulation rate reaches a certain threshold value, the cell initiates an immune response. This 
model relies on several hypotheses, which are known as kinetic proofreading [2Tll22p i8 ] fT3]. serial 
triggering [3Tll30P27ll4ll29pi0j . counting of stimulated TCRs [361125]. and the optimal dwell-time 
hypothesis [T5lfT2]. 
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Due to the huge amount of different receptor and antigen types, it is impossible (and unnec- 
essary) to prescribe the binding durations for all pairs of receptor and antigen types individually. 
Therefore, BRB chose a probabilistic approach to describe the meeting of APCs and T-cells. A 
randomly chosen T-cell (that is, a randomly chosen type of receptor) encounters a randomly cho- 
sen APC (that is, a random mixture of antigens). The mean binding time that governs the binding 
of this random receptor to the jth type of antigen is taken to be a random variable denoted by Tj . 
The Tj are independent and identically distributed (i.i.d.) and are assumed to follow the Exp(l/f) 
distribution, i.e., the exponential distribution with mean f, where f is a free parameter. Note 
that there are two exponential distributions (and two levels of averaging) involved here. First, the 
duration of an individual binding between a type-j antigen and a random receptor is Exp(l/7}) 
distributed (see the discussion of Eq. H])). Second, 7^ , the mean duration of such a binding (where 
the receptor is chosen once and the times are averaged over repeated bindings with a j antigen) 
is itself an exponential random variable, with realisation Tj. Finally, its mean, ^{Tj) = f, is the 
mean binding time of a j-antigen (and, due to the i.i.d. assumption, of any antigen) when averaged 
over all encounters with the various receptor types. The exponential distribution of the individual 
binding time is an immediate consequence of the (first-order) unbinding kinetics. In contrast, the 
corresponding assumption for the Tj is made for simplicity; the approach is compatible with var- 
ious other distributions as well, see [33] and [37]. The i.i.d. assumption, however, is crucial, since 
it impHes, in particular, that there is no difference between self and foreign antigens here; i.e., no 
a priori distinction is built into the model. 

The total stimulation a T-cell receives is the sum over all stimulus rates Wj = w{Tj) that 
emerge from antigens of the j'th type. It is further assumed that there is at most one type of 
foreign antigen in z^^^ copies on an APC, whose signal must be discriminated against the signals 
of a huge amount of self antigens. (There could, in principle, be multiple foreign peptide types, 
but there are good reasons to assume that there are mechanisms to ensure that a given T-cell sees 
at most one foreign peptide type, see [34]). The self antigens are here divided into two distinct 
classes, c and w, that are present in different copy numbers z''^^ and z^^\ An APC displays m^'^^ 
and m^'") different types of class c and v. The indices c and v stand for constitutive and for 
variable, respectively; but for the purpose of this article, only the abundancies are relevant, in 
particular, z'-'^^ > z^'") and m^^' < m'"^ . Over the whole APC the total number of antigens is then 
_|_ =. if foreign antigen is present. If z*-^-* foreign molecules are also present, 

the self molecules are assumed to be proportionally displaced (via the factor q := (M — z^^'>)/M), 
so that the total number of antigens remains unchanged at 

^U) + m^<^)qz''<^) + TO^gzW = M. (2) 

The total stimulation rate in a random encounter of T-cell and APC can then be described as 
a function of z^^^: 

G(z(/)) 9^^'^^^- + E '^^^''^J- (3) 

\i=l / \j=m(')+l J 

i.e., a weighted sum of i.i.d. random variables. Alternatively, we consider the extension of the 
model proposed by Zint et al. |37], which, instead of the deterministic copy numbers z'^'^\z^^\ 
uses random variables Z^^\z^^'^ distributed according to binomial distributions with E(zj^'') = 

z^'^^ , E(zj"'') — z^^\ where E denotes expectation (so the expected number of antigens per APC 
is still M). The model then reads 

G(z(/)) := Y ^^f^^ + E ^^f^^'^ + -^^^^™(^.+™<".+i- (4) 

In line with [331137]. we numerically specify the model parameters as follows: f = 0.04; m^^' = 
50, m('') = 1500, z^'^) = 500, z'"' = 50 (and hence M = 10^). The distributions in the extended 
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model are the binomials Bm{('-'^\p) and Bin{('-'"\p) for and Zj respectively, where C'"'^'' = 
1000, C^"^ = 100, andp = 0.5. 

The relevant quantity for us is now the probability 

P(G(z(^)) > Pact) (5) 

that the stimulation rate reaches or surpasses a threshold ^act- To achieve a good foreign-self 
discrimination, there must be a large difference in probability between the stimulation rate in the 
case with self antigens only {z'^^^ = 0), and the stimulation rate with the foreign antigen present, 
i.e., 

1 » P(G(z(-^') > 5act) » P(G(0) > 5act) > (6) 

for reaHstic values of z^^\ Note that both events must be rare events - otherwise, the immune 
system would "fire" all the time. Thus ^act must be much larger than E{G{z^^'>)) (which, due 
to ([2]) and the identical distribution of the Wj, is independent of z^-''^). Evaluating these small 
probabilities is a challenge. So far, two routes have been used: analytic (asymptotic) theory based 
on large deviations (LD) and straightforward simulation (so-called simple sampling). Both have 
their shortcomings: the LD approach is only exact in the limit of infinitely many antigen types 
(and the available error estimates are usually too crude to be useful); the simulation strategy, on 
the other hand, is so time-consuming that it becomes simply impossible to obtain sample sizes 
large enough for a detailed analysis, in particular for large values of ^act- Therefore, an importance 
sampling approach is required. Let us now recapitulate some underlying theory. 

3 Rare event simulation: general theory 

The general problem we now consider is to estimate the probability P{A) of a (rare) event A 
under a probability measure P. The straightforward approach, known as simple sampling, uses 
the estimate 

{P{A))^ := - ^ eA} = -card{l <t<N\ 5^ € A}, (7) 

i=l 

where the {5'^'^}i<i<7v are independent and identically distributed (i.i.d.) random variables with 
distribution P, denotes the indicator function, and N is the sample size; we will throughout 

use V for an estimate of a quantity v. {P{A))n is obviously an unbiased and consistent estimate, 
but, for small P{A), the convergence to P{A) is slow, and large samples are required to get reliable 
estimates. 

Various simulation methods are available that deal with this problem and yield a better rate 
of convergence (see the monograph by Bucklew [5] for an overview). Most of them achieve this 
improvement by reducing the variance of the estimator. We will concentrate here on the most 
wide-spread class of methods, namely importance sampling. As is well known, one introduces a 
new sampling distribution Q here under which A is more likely to happen, produces samples from 
this distribution and returns to the original distribution by reweighting. In general, finding a good 
importance sampling distribution that reduces the variance as much as possible is an art, and 
much of the literature revolves around this. Some general purpose and many ad hoc strategies 
exist, but usually, importance sampling distributions are best tailored by exploiting the structure 
of the specific problem at hand. However, if the problem can be embedded into a sequence of 
problems for which a so-called large deviation principle is vaHd, a unified theory is available that 
identifies the most efficient simulation distribution. This technique of "large deviation simulation" 
was introduced by Sadowski and Bucklew [2^, laid down in the monograph by Bucklew [5], and 
further developed by Dicker and Mandjes [9]. It rests on the well-established theory of large 
deviations, as summarised, for example, in the books by Dembo and Zeitouni [7] or den Hollander 
[H]. Let us recapitulate the basic background. 
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3.1 Large deviation probabilities 

Consider a sequence {S'„} of random variables on the probability space {R'^,B,P), where B is 
the Borel cr-algebra of M'^. Let {Pn} be the family of probability measures induced by {Sn}, i.e., 
Pn{B) = V{Sn e B) for B e B. We assume throughout that {Sn} satisfies a large deviation 
principle (LDP) according to the following definition [7l[9]: 

Definition 1 (Large deviation principle) A family of probability measures {Pn} on (R'',^) 
satisfies the large deviation principle (LDP) with rate function / if / : R'^ — > [0, oo] is lower 
semicontinuous and, for all B £ B, 

- inf I{x) < lim inf - log P„ (B) < lim sup - log P„ (B) < - mf_ I{x) , (8) 

where B° := int(i3) and B :— clos(i3) denote the interior and the closure of B, respectively. / is 
said to be a good rate function if it has compact level sets in that /~^([0, c]) = {x e R"^ : I{x) < c} 
is compact for all c £ R''. □ 

A set B is called an I- continuity set if 

inf I{x) = inf I{x) = mi_I{x). (9) 

x£B° x£B 

If B is such a set, the LDP means that Pn{B) decays exponentially for large n, with decay 
coefficient vaix^B A point b is called a minimum rate point of B if inf^es I{x) = I{b). 

Large deviation principles are well known for many families of random variables, like empirical 
means of i.i.d. random variables or empirical measures of Markov chains. For the application we 
have in mind, which involves sums of independent, but not identically distributed random variables, 
we need the fairly general setting of the Gartner-Ellis theorem, which we recapitulate here (cf. 
[a Thm. 2.3.6] and [HI Ch. V]). Let <^„(i?) := Ep„ (e^'''^"^), -d e R'', be the moment-generating 
function of S'„, where (., .) denotes the scalar product and E^(.) denotes the expectation of a 
random variable with respect to the probability measure /i. 

Theorem 1 (Gartner-Ellis) Assume that 
(Gl) lim„_,oo ^ log V'n ("-i^) =: ^('?) G [— oo, cx)] exists, 

(G2) e int(Pyi), where Va := {?? e R"* : A{'d) < oo} is the effective domain of A, 
(G3) A is lower semi-continuous on R^, 
(G4) A is differentiable on int(2?/i), 

(G5) Either Va = R'^ or A is steep at its boundary OVa, i-e., limi„t(D^)3,5-»aD^ l^^(^)l — oo. 

Then, {P,i} satisfies the LDP on R'^ with good rate function I, where I is the Legendre transform 
of A, i.e., 

I{x) ^ suY,[{x,d) - A{^)], xeM."^. (10) 

□ 

The function Am (Gl) is convex. If there is a solution d* of 

\IA{d)^x, (11) 

one has 

i{x) ^ {r,x) ~ A{r). (12) 

If vl is strictly convex in all directions, 'd* is unique. See Fig.|4]for a one-dimensional example (the 
T-cell application, in fact). 
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3.2 Simulating rare event probabilities 

Let now A e S be a rare event in the sense that < mixeA Hx) < oo. Here, the first inequality 
implies that A becomes exponentially unlikely as n — > cx), whereas the second inequality serves to 
exclude nongeneric cases (in particular cases where the event is impossible). An important notion 
for the rare event simulation of Pn{A) is that of a dominating point [H p. 83]: A point a is a 
dominating point of the set A if it is the unique point such that 

a) a e dA, 

b) 3 a unique solution t9* of VA{-d) ~ a, and 

c) ^ C {x G M"* : {d*,x-a) > 0}. 

A dominating point, if it exists, is always a unique minimum rate point (see [3 p. 83]). Con- 
vexity of A implies existence of a dominating point (cf. (3). 

Following [9] we now turn to the problem of simulating P„(A) = Ep^(l{5'„ G A}). The naive 

simple-sampling estimate obtained from N i.i.d. copies 5*1'' (1 < i < ^), drawn from P„, IS, as m 
0, given by 

{pjA))^:=^f2MSn^ ^A}. (13) 
i=i 

It is unbiased and converges (almost surely) to Pn{A) in the limit iV — > oo, but it is inefficient 
since it requires that N increase exponentially with n to yield a meaningful estimate. Instead of 
{Sn}, one therefore considers an alternative family of random variables, {T„} with distribution 
family {Qn}, again on (IR'', B), under which A occurs more frequently. Assuming that P„ and Qn 
are absolutely continuous with respect to each other, one can use the identity 

p„(A) = Ep„(i{5„ e A}) = EQ„(i{r„ e a}^(t„)), (14) 

where dP„/d(5„ is the Radon-Nikodym derivative of P„ with respect to Q„. The resulting impor- 
tance sampling estimate then relies on i.i.d. samples Tn'' from {Qn} and reads 

1 w HP 

i—l 

where (dP„/dQ„)(.) acts as a reweighting factor from the sampling distribution to the original 
one. It is reasonable to assume that (dP„/dQ„) is continuous to avoid the usual problems with 
L^-functions; this is no restriction for our intended application. 

An adequate optimality concept in this context is that of asymptotic efficiency. According to 
[9], it is based on the relative error r]j^{Qn,A) defined via its square 

{PniA)) 

(where Vp(.) denotes the variance of a random variable with respect to the probability measure /i). 
The relative error is proportional to the width of the confidence interval relative to the (expected) 
estimate itself. Asymptotic efficiency is then defined as follows. 

Definition 2 (Asymptotic efficiency) An importance sampling family {Qn} is called asymp- 
totically efficient for the rare event A if 

lim ilogA^5„ =0, (17) 

where Nq^ := inf{A^ G N : 7]j^{Qn,A) < r/max} for some given maximal relative error 
< »7max < oo- 



9 



In words, asymptotic efficiency means that the number of samples required to keep the relative 
error below a prescribed bound ?7„jax increases only subexponentially (rather than exponentially 
as with simple sampling). The concrete choice of ?7,jiax is actually irrelevant, see Lemma 1 in [9]. 

An obvious idea from large deviation theory would be to use, as sampling distributions, the 
family of measures {P^} that are exponentially tilted with parameter -d, that is, 

dPn Vn{nd) 

then takes the role of Qn- The task remains to find a suitable iS, i.e., a tilting parameter 
that makes {P^} asymptotically efficient. Necessary and sufficient conditions for this are given in 
[9l Assumption 1 and Corollary 1] and are summarised below, in a form adapted to the present 
context. 

Theorem 2 (Dieker-Mandjes 2005) Assume that, for some given iS* , 
(VI) {Pn} satisfies an LDP with good rate function I , 

(V2) limsup^^oc i log (pnijnd*) < oo for some 7 > 1, and, likewise, with 'd* replaced by —-d*, 
(V3) The rare event A is both an I-continuity set and an {I + (■&*, .)) -continuity set. 

Then, the tilted measure } is asymptotically efficient for simulating A if and only if 

inf [I(x) - + inf [/(x) + {^*,x)] = 2 inf I{x). (19) 

xGW^ xeA 2:eA° 

We use assumption (V2) here to replace the weaker but less easy to verify condition (2) in 
Assumption 1 of [9], in line with the paragraph below (2) in [9], or [Tj Thm. 4.3.1]. Note also that 
(V2) holds automatically if (p„(ni?) exists for all i? - but this is not mandatory here, since only a 
given is considered. 

The proof of Theorem [2] is given in [9] and need not be recapitulated here; but we would like 
to comment briefly on what happens in the central condition (fT9| . Replacing Qn by P^ in lfT6| 
and l|15p. we can rewrite 77^ as 

^, ¥pj,^{P^^A))n lYp.'{P'^A))i 
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(20) 



Obviously (by (VI) and (V3)), 2mfx£A° (i.e., the right-hand side of lfT9|) ) is the exponential 
decay rate of (P„(^))^. Inspection of the proof of Theorem [2] reveals that the left-hand side of 



(fT9|) is the exponential decay rate of ydp^J '^^n ■ It is clear from (|20|) that, for asymptotic 
efficiency to hold, I ^p^* 1 dP^ must tend to zero at least as fast as (P„(^))^. But it cannot 



decrease faster, since Vpo* (Ppo- (A))i is nonnegative, so that Xi (^dQ^j '^'S" — (^n(^))^ fo'" 
arbitrary Qn- Hence, the exponential decay rates must be exactly equal, as stated by lfT9|) . (A 
closely related argument is given in O Ch. 5.2].) 

Theorem[2]is widely applicable. It holds in many standard situations, in particular in many of 
those that arise in applications. 

Proposition 1 Let {Pn} be a family of probability measures that satisfy the conditions of the 
Gartner-Ellis theorem, with (good) rate function I . Let A be a rare event with dominating point a, 
let -d* be the unique solution ofWA{d) = a, and assume (V2) and (V3). Then {Pn } is the unique 
tilted family that is asymptotically efficient for simulating Pn (A) . 
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Proof The proof is a simple application of Thm. [2l (VI) follows from the Gartner-Ellis theorem; 
we only need to verify condition lfT9|) . For the first infimum in (fT9| . one obtains 

inf [I{x) - {'&*,x)] = -A{d*) = I{a) - (r , a). (21) 

Here, the first step follows from the convex duality lemma (compare [II Lemma 4.5.8]), which is 
applicable since A is lower semicontinuous by (G3), and convex and > — oo everywhere (this follows 
from (Gl) and (G2) by [H Lemma V.4]). The second step is due to part b) of the dominating 
point property of a, together with Eq. fT2|). 

As to the second infimum in ifTO]) . a minimises both / and (i?*, .) on A (by the dominating 
point property). Together with (V3), this gives 

mi_[I{x) + {r,x}]=Iia) + {r,a}. (22) 

Eqs. (|2T|) and (|22|) together give (flOl) because mfxeA° lix) — inf^gg^ -^(a^) = -^(a)- □ 

Remark 1 Note that an efficiency result closely related to Proposition [T] has previously been given 
by Bucklew [U Thm. 5.2.1], but this is based on the variance rather than the relative error; and 
it is only a sufficient condition. 

Note also that our assumption of a dominating point greatly simplifies the situation. Theorem 2 
also allows to cope with situations without a dominating point - but this is not needed below. 

Let us now apply this theory to the T-cell model. 
4 Rare event simulation: the T-cell model 

Recall that simulating the T-cell model means sampling the random variables G{z^^'>) of ^ and 
estimating the corresponding tail probabilities V{G{z'^^^) > ^act)- Inspection of Eq. ^ reveals two 
difficulties: 

1. G{z^^^) is a weighted sum of i.i.d. random variables, to which the standard results for sums of 
i.i.d. random variables (in particular, Cramer's theorem) are not appHcable. We therefore need 
an extension to weighted sums - or, better, to general sums of independent, but not identically 
distributed random variables, which include weighted sums as a simple special case. This is 
straightforward and will be the subject of Sect. 14.11 In particular, it will be seen that, like in 
the i.i.d. case, every term in the sum must be tilted with the same parameter, but now this 
global tilting factor is a function of all the individual distributions involved. 

2. Simulating the random variables Wj = w{Tj) is straightforward via simple sampling: draw 
Exp(l/f) distributed random numbers Tj (as realisations of TJ) and apply the transformation 
HJ). However, simulating the corresponding tilted variables is a difficult task, for two reasons. 
First of all, there is no indication of how to sample from the tilted distribution via transfor- 
mation of one of the elementary distributions (like Uni[o_i] (the uniform distribution on the 
unit interval), or Exp(A)) for which efficient random number generation is possible. Although 
such a transformation might exist in principle, there is no systematic way of finding it. One 
reason for this is that tilting acts at the level of the densities, but even the original (untilted) 
density of = w(T) is not available explicitly. (With W and T (without indices) we mean 
any representative of the family.) This is because its calculation requires the inverse functions 
and derivatives of the two branches (increasing and decreasing) of the function w, but these 
are unavailable analytically. 

In the absence of a transformation method, one might consider to determine the tilted density 
numerically, integrate it (again numerically) and discretise and tabulate the resulting distri- 
bution function. However, this is, again, forbidding for our particular function w: due to the 
vanishing derivatives at T = and T = 1, the transformation formula for densities yields 
singularities in the density of W at these values, with a sizeable fraction of the probability 
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mass concentrated very close to (see Fig. [S]). This renders numerical calculations unreliable. 
To circumvent these problems, we will, in Sect. 14. 2^ present a sampling method for the tilted 
random variable that is based on tilting T rather than W itself. 



4.1 Large deviations for independent but not identically distributed random variables 

We consider K independent families of i.i.d. M'^-valued random variables, {Y^^^}, . . . , {Y^^^} (i.e., 
the distribution within any given family {Y^''^}, 1 < k < K, is fixed, but the distributions may 
vary across families). Assume that A^'^\'d) := logE(e^'''^i '^), the log moment-generating function 
of Yj''\ is finite for all d GlSJ^ and 1 < k < K (here, E(.) refers to the probability measure induced 
by the random variable involved). Let n'-'^', . . . , n'^-' be positive integers, n := 

K.:=E^/'^ + ---+E^/'''' (23) 

e=i 1=1 

and Pn be the probability measure induced by Sn = Vn/n. In the limit n oo, subject to 
nP^^ /n — > 7*^'^'-' for all 1 <k < /f, the limiting log-moment generating function of {i^n} becomes 

K [k) 

A{-d) = lim ilogE(e<^'^">) = lim V —ylW(i?) = V 7^/1^(1^), (24) 

k=l k=l 

where the second step is due to independence. Since, by assumption, A^''\-d) < 00 for all G M"^ 
and I < k < K, the /l'*''' are differentiable on all of R'^ (see [Zl Lemma 2.2.31]); in fact, they are 
even C°°(M'') Ex.ercise 2.2.24]. Thus, A is C°°(M'^) as well. 

By l|24p . we have (Gl). Again due to (t?) < 00, (G2) and (G5) are automatically satisfied. 
Furthermore, the differentiability of A entails (G3) and (G4). We have therefore shown 

Lemma 1 Under the assumptions of this paragraph, {Pn} satisfies the Gartner-Ellis theorem, 
with rate function I given by Eq. (fTO|) . □ 

Such {Pn} are therefore candidates for efficient simulation according to Prop.[TJ The tilting factor 
'd* may not be accessible analytically, but can be evaluated numerically from (fTTI) . Due to inde- 

(k) 

pendence, tilting of Sn with n'd* (that is, tilting of Vn with z?*) is equivalent to tilting each 
with t?*. 



4.2 Tilting of transformed random variables 

Unlike the Wj, the Exp(l/f)-distributed random variables Tj are tilted easily (tilting with 1? simply 
gives Exp(— ^ -|- 1/r)). One is therefore tempted to tilt the Tj rather than the Wj, or, in other 
words, to interchange the order of tilting and transformation. The following Theorem states the 
key idea. 

Theorem 3 Let X be an W'- -valued random variable with probability measure fj., and let Y := hoX 
(or Y = h(X) by slight abuse of notation), where h : ^ M.'^ is ji-measurable. Then Y has 
probability measure v ~ ^ o h^^ , where h^^{y) denotes the preimage of y. Assume now that 
Ep(e^'''''''''-'^) exists, let X^ be an W^-valued random variable with probability measure fi^ related 
to /i via 
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(so that jj,"^ <C n), and let Y"^ — h{X^). Then, the measures (ofY^) and i/^ (for the tilted 
version of v, belonging to Y"^ ) are equal, where <^ u with Radon-Nikodym density 

Proof Note first that e^''-^^ is clearly ^-measurable, and 

E,(e<'''^>)= / e<'''^>di.(y) = / e<'''''(-»d/.(x) = E^(e<'''''W>), (27) 



which exists by assumption, so is well-defined. We now have to show that [B) = i'^{B) for 
arbitrary Borel sets B. Observing that P'^ = ^^oh~^ and employing the formulas for transformation 
of measures [3l (13.7)] and change of variable [3l Thm. 16.13], together with l(25|) . one indeed obtains 



:>^{B) = f^\h-'iB)) = f ^(x)d/i(x-) = \ / e<^-''(-)>dM(x) 



(28) 



E.(e('''^))7B Jb d,y 

which proves the claim. □ 

In words. Theorem [3] is nothing but the simple observation that, to obtain the tilted version of 
Y = h{X), one can reweight the measure ^ of X with the factors e^^''^^^''\ rather than reweighting 
the measure ly of Y with e^'''^^ It should be clear, however, that the measure fl^ diflFers from 
the usual tilted version of fi, which would involve tilting factors e^'''^^ rather than e^''''''^'^^^; for 
this reason, we use the notation fl"^ rather than /x''. Such kind of tilting is common in large 
deviation theory (see, e.g., [H Chap. 2.1.2]). Nevertheless, the simple observation above is the key 
to simulation if fi (and fl"^) are readily accessible at least numerically, but v (and ly^) are not. 

This is precisely our situation, with , aW^ and aw (a S {qz^'^\qz^'"\ z^-^^}), respectively, 
taking the roles of X"^ , Y^ and h (we will use /, , g and g^ for the corresponding densities of T, 

, aW, and {aW)^). Still, reweighting of the exponential density of T with e''"'"^'^) does not yield 
an explicit closed-form density, and no direct simulation method is available for the corresponding 
random variables. However, the reweighted densities are easily accessible numerically, in contrast 
to those of W and its tilted variant, . The problem may thus be solved by calculating and 
integrating numerically and discretising and tabulating the resulting distribution function . 
Samples of T'' may then be drawn according to this table (i.e., by formally looking up the solution 
of F(7"*) = U for U ^ Uni[o,i]), and aW^ = aw{T^) is then readily evaluated. The only difficulty 
left is the time required for searching the table. But this is a practical matter and will be dealt 
with in the next paragraph. 



4.3 The algorithm 

Taking together our theoretical results, we can now detail the specific importance sampHng algo- 
rithm for the simulation of the T-cell model of Sect. [H If not stated otherwise, we will refer to the 
basic model ([3]). Recall that it describes the stimulation rate G{z^^'^) and we wish to evaluate the 
probability P(G(z(-^)) > g^ct). 

To apply LD sampHng, let us embed the model into a sequence of models with increasing total 
number n = rS'^^ + n'"-' -|- n^^^ of antigen types, where rL^'^\ rS"\ and n'-^-' are the numbers of 
constitutive, variable and foreign antigen types. (This is an aritificial sequence of models required 
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to formulate the limiting process involved in the theory; in contrast to the original model, there 
can now be multiple foreign antigen types.) Let 



where 




E 

, j=n(<:)+l 




„(c)2(c) _|_ n{v)^(v) 



(29) 



(30) 



(where z'^'^\ z'^"\ and z'^^^ are independent of n). Clearly, G'„(z'^'^)) coincides with G{z^^'^) of |[3| 
if rS'^^ ~ m^'^\ n^"' = m'^'"\ and n'-^-' = m^-^\ where to^-^' = or m^-^^ = 1 depending on whether 



.(/) 



or z^f^ > 0; then, 



We have to consider P(G„(z(-^')/n > 



Pact/™) (this reflects the fact that g^ct must scale with system size). The sequences {Gniz^-^"^)} 
and {Gn{z^^^)}/n take the roles of {Vn} and {^n}, respectively, in Secs. l3.1l and l4.1l with P„ the 
law of Gn{z^-'^)/n] and we consider A = [gact/™, oo) with E(Gm(^:^-'^^)/?Ti) < gact/™ < Mw{l)/m 
(the latter is the maximum value of G„i{z'^-^'>)/m since w(t) has its maximum at t = 1). The 
limit n — > cx) is then taken so that lim„^oo n^'^^/n = m^'^^/m, Wuin^^n'^^^ /n = m^'"'>/m, as well 
as lim„^oo n^-'^^/n = m'^^^/m, that is, the relative amounts of constitutive, variable, and foreign 
antigens approach those fixed in the original model, Q. (Note that, in [37], a different limit was 
employed, namely, n ^ oo with lim„^oo n^'^^nf"^ = Ci £ (0, cxd) and lim„_,oo ^^^''^V'^ = 0; this is 
appropriate for exact asymptotics, but not for simulation, because the asymptotic tilting factor 
to be used in the latter then does not feel the foreign antigens.) 



Lemma 2 Let f be the density o/Exp(1/t) (i.e., f{T) = e '^^'^ /t), and 



?/'(i) := E(e*^) = / exp (^(t)) /(T)dT 



exp t 



,exp(-l/T) 



dT 



(31) 



he the moment-generating 

r of 



of Wi . Under the assumptions of Sect. \4.3[ the unique solution 



5act 

m 



-qz 



^log^(t) 



m 



-qz 



(v) 



^log^(t) 



^log^(t) 



(32) 



parameter for LD simulation of Pn (A) . 



is the unique asymptotically efficient 

Proof Clearly, P„ satisfies the assumptions of Sect. 14.11 Note, in particular, that ilj{t) < oo for all 
t G M since W is bounded above and below, and so 



m 



[C) [V) 1 

lim logE(e''^"(^"'')/") = ^logV'(gz^'^t?) + ^ log V'(g^''''i9) + - log V(^'^'^9) < oo 
-i^oo m m m 

(33) 

for all hence, the Gartner-Ellis theorem holds by Lemma[TJ To verify the remaining assumptions 
of Prop.[Tl recall from Sec. I4.1l that Aij)) is differentiable (with continuous derivative) on all of R. 
The bounds on ga,ct/m lead to 



^'(0) - 



E(G(z 



^ ffact ^ Mw{l) 



lim A'{^). 



(34) 



TO m m -9- 

A is strictly convex (since (d^/di^) log'0(i) is the variance of , the tilted version of W (cf. [21 
Prop. XILl.l]), which is positive since W and hence is nondegenerate). Eq. l(34|) thus entails 
that A'ij)) = ^act/w has a unique solution d* , which is positive (and clearly satisfies (V2)). As a 
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consequence, gact/w is a dominating point of A, which is a rare event since < I{ga.ct/'m) < oo 
(by yl(0) = together with ^ and cf. Fig. IH left). Finally, ^ is a continuit_y_set of both 

I and / + {i}*, ■) simply because / and {i}* , .) are continuous at ffact/™, and A = A°. Realising 
that the right-hand side of ((32l) equals (see also Eq. (20) in [37]), one obtains the claim from 

Prop.m □ 




Fig. 4 The cumulant-generating function A (left) and the rate function / (right) for the T-cell model JS]). 
The slope of the straight line in the left p ane l is a = ga.ct/m, where (/act = 800 and m = 1551. At 
ai) — A{'d) assumes its maximum, 7(a) (cf ifTOjl -lfT2|ll. 



The solution of l(32|) is readily calculated numerically. The function A, and the resulting rate 
function /, are shown in Fig. [H 

As described in Sect. l4.2l we now tilt the density / of the Tj with according to Eq. l(25|) . This 
yields three different densities , depending on the weighting factors a € {gz^^^ , qz'^^^ , z'^^'>}, 
namely 

exp(arz.(r))/(r) h^P {ai^*^-^^^^ - 

As discussed in Sect. 14.21 this is not the density of any known standard distribution (let alone 
an exponential one) , and simulating from it requires numerical integration (which is well-behaved 
since the are numerically well-behaved), and discretisation and tabulation of the resulting 
distribution functions , followed by looking up the solution f'' of {T^ ) = U for U ^ 
Uni[o,i], to finally yield aW^' via aW^' = aw{f^'). 

Searching the table would be the speed- (or precision-) limiting step, requiring OilogD) oper- 
ations if D is the number of discretisation steps. This can be remedied by applying the so-called 
alias method to quickly generate random variables according to the discretised probability distri- 
bution. For a description of the method, we refer the reader to |19[ pp. 25-27], [16], or [2H p. 248]. 
Let us just summarise here that, after a preprocessing step, which is done once for a given distri- 
bution, the method only requires one Unijo^ij random variable together with one multiplication, 
one cutoff and one subtraction (or two Uni[o_i] random variables together with one multiplication, 
one cutoff and one comparison, depending on the implementation) to generate one realisation of 
T"^ , regardless of D (in particular, it does without searching altogether). 

We now have everything at hand to formulate the algorithm to simulate (realisations of) G{z'^^'^) 
of ([3]). (For notational convenience, we will not distinguish between random variables and their 
realisations here). 

Algorithm 1 

compute d* by solving Eq. ((32| numerically 
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calculate the tilted densities f^,a£ {qz^'^\ qz'^"'' , z^^^} , via l(35|) 
for i=l till sample size N do 

for every summand j of ^ generate a sample {T^ )^*^ according to its density fa{j) '^'^^^ 

help of the alias method (here, the upper index (i) is added to reflect sample i, and a{j) is the 

weighting factor of the sum to which j belongs) 

calculate 

\ j=l / \ j=m(<=)+l / 

calculate the indicator function times the reweighting factor (i.e., the i-th summand in Eq. (fT5l) ) 
if {G{z(f^)Y') > g,,t then 

else 

= 

end if 
end for 

calculate (P]?* (A))^ = , as estimate of¥{G{z(-f^) > g^ct). 



4.4 Extension to variable copy numbers 

Let us now consider the extended model in which the copy numbers are themselves random 
variables. This is also covered by the large deviation theory presented above; in particular, Lemma 
[U again applies if the Y^''^ in l(23)) are identified with zj'^^ Wj or zj"^ Wj , respectively. The global 
tilting factor i?* is, in the usual way, calculated as the solution of A'{'d) = 5act/™, where is 
as in {SI with = E(e«^""'"^) replaced by E(?/;(gZ('=)i9)) = E(e«^*"''^), k G {c,v}; see 

Eq. (20) in [13. 

However, the object of tilting now is the jomi distribution ofW and Z'-'^^ (or Z'-^\ respectively), 
that is, dF{T)dH<'^\z) receives the reweighting factor exp(gi?2;u>(r)), where F and i/'^'^^ denote 
the measures of T and Z^''\ k £ {c,v}, respectively. This introduces dependencies between copy 
numbers and stimulation rates. The resulting bivariate simulation task is costly and may offset 
some of the efficiency gain obtained by tilting. 

If, however, the Z^'''> are closely peaked around their means (as is the case for our choice of 
parameters), the following hybrid procedure turns out to be both practical and fast: Draw the Z^'^^ 
from their original (untilted, binomial) distributions; and simulate a tilted version of qW, denoted 
by {qW)^ , by reweighting the original density of qW with cxp((7?9*E(Z('^^)iy), irrespective of the 
actual value of Z. Clearly, this method is not asymptotically efficient, but it is a valid importance 
sampling method that turns out to compare well with the ideal procedure used for the fixed copy 
numbers (see Sec. 15.1.31) . 



5 Results 

Let us now present the results of our simulations in two steps. We first investigate the performance 
of the method, and then use it to gain more insight into the underlying phenomenon of statistical 
recognition. 
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5.1 Performance of the simulation method 

We will examine the performance of the importance-sampling method in three respects: we will 
compare it to simple sampling (the previously-used simulation method) and to the results of exact 
asymptotics (the previously-used analytic method) ; finally, we will quantify the efficiency in terms 
of the relative error (and thus return to the theory of Sect. 13. 2p . In any case, we will consider 
¥{G{z^^'') > gact) as a function of gact (and for various values of the parameter z^^'>). Of course, 
this probability is just one minus the distribution function of G{z^^'>)] in immunobiology, the 
corresponding graph is known as the activation curve. 

Evaluating this graph by LD simulation requires, for each value of ^act to be considered, a 
fresh sample, simulated with its individual tilting factor i?* (recall that this depends on gact via 
((321)). first sight, this looks like an enormous disadvantage relative to simple sampHng, where 
no threshold needs to be specified in advance; rather, the outcomes of the simulation directly 
yield an estimate over the entire range of the activation curve. However, it will turn out that 
this disadvantage is offset many times by the specific efficiency of hitting the rare events in LD 
sampling. (There is room for improvement: the samples that do not hit a given rare event could 
be used to improve the estimates of the more Hkely events.) 

5.1.1 Comparison with simple sampling 

Clearly, both the simple-sampling and the importance-sampling estimates are unbiased and con- 
verge to the true values as iV — > cx). It is therefore no surprise that they yield practically identical 
results wherever they can be compared - and this yields a first quick consistency check for our 
method. 

This is demonstrated in Fig. [U which shows simple sampling (SS) and importance sampling 
(IS) activation curves, each for z^f'> = 1000 and z^f^ = 2000. For SS, iV = 1.3 * 10^ samples, 
{G{z'^^'>))^''\l < i < N, were generated altogether for every graph, whereas for IS, N ~ 10000 
samples were generated for every threshold value considered (from ^act = 100 to ^act ~ 1000 in 
steps of 50), i.e. 1.9 * 10^ samples altogether. Beyond gact = 450 and gact = 800 (for z^-^^ — 1000 
and z^^^ = 2000, respectively), no estimates could be obtained via SS due to the low probabil- 
ities involved, whereas with IS, it is easy to get beyond gact = 900 in either case, although the 
probabilities can get down to 10"^" (note, however, that this far end of the distribution is no 
longer biologically relevant). In terms of runtime, determining an activation curve (over its entire 
range) by SS took 48 hours of CPU time (Intel Pentium M 1.4 GHz 512MB RAM), whereas IS 
required only about 2 minutes (in the threshold regime where the methods are comparable), that 
is, a speedup by a factor of nearly 1500 is achieved. 

We also applied our method to the extended model ^ with binomially distributed copy num- 
bers. Figure El shows the simulation results for two values of z^^\ each for SS and IS. Again, the 
curves agree, as they must. As to runtime, it took about 130 hours to generate the 2 * 10^ samples 
for SS, whereas for IS it took 10 min. to generate the 9.5 * lO"' samples. 

5.1.2 Comparison with exact asymptotics 

A pillar of the previous analysis of Zint et al. [37j (and its precursor BRB [33]) has been so-called 
exact asymptotics. This is a refinement of large deviation theory which yields estimates for the 
probabilities P„(A) themselves, rather than just their exponential decay rates obtained via the 
LDP in Def . [H With standard large deviation theory (and our simulation method) , it shares the 
tilting parameter which is calculated according to Eq. l(32|) : for more details, we refer to [37]. A 
comparison of IS simulation with exact asymptotics is also included in Fig. [H For small values of 
gact, exact asymptotics is slightly imprecise. This is due to the asymptotic nature {n oo) of the 
method, which yields more precise results in the very tail of the distribution, where the deviations 
are truly large. Note that, although our tilting factors agree with those in exact asymptotics, rare 
event simulation does not suffer from this accuracy problem since, due to the reweighting, it is 
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Fig. 5 Estimates of the activation curve, P(G(z^''^') > Pact), in the basic model ^ for z'^' = 1000 and 
z'^' = 2000, as well as for the self background (z'^' = 0), on logarithmic scale. The probabilities were 
estimated independently with simple sampling (SS ), im portance sampling (IS), and exact asymptotics 
based on large deviation theory (LDT) as used in |37) . For IS, 19 values of gact were considered (from 
100 to 1000 in steps of 50), and = 10000 samples were generated for each value (i.e., 1.9 * 10^ samples 
altogether), whereas for the SS simulation. A'' = 1.3 * 10* samples were used over the entire range. The 
SS curves end at gact = 400 and gact = 800, respectively, because larger values were not hit in the given 
sample. The IS and SS graphs agree perfectly until the SS simulation lacks precision. For larger threshold 
values, we see a perfect agreement of the IS and LDT graphs. Note the general feature that, for threshold 
values that are not too small, the activation probability in the presence of foreign antigens is several orders 
of magnitude larger than the self background, i.e. Eq. ^ is satisfied. 



always a valid importance sampling scheme that yields unbiased estimates for every finite n; the 
finite-size effects will only manifest themselves as a certain loss of efficiency, as will be seen below. 



5.1.3 Asymptotic efficiency and relative error 

In order to investigate the relative error of (Ppo* {A))j\j, we first note that the variance of the 
estimator is given by 

v((p5^)),) = ^^{{pTFTa)),) = ^IE[(l{(Tr )(^) e A}^{iT:'r^) - P„(A)) 

" (36) 

where we have used (fT5| for N = 1. V((Pp«» (^))]^) can be estimated via the given number N of 
samples in a single simulation run, i.e., as the sample variance 
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Fig. 6 Simulation of r{G{z'-f^) > g^t) in the extended model ©, for z'^^'' = 1500 and z'-^^ = 2500. The 
probabilities were estimated independently with simple sampling, and with importance sampling at 19 
different threshold values (from 100 to 1000 in steps of 50). For IS, 9.5 * 10* samples were generated (5000 
per threshold); for SS, 2* 10^ samples were used. No estimates are obtained with SS at thresholds beyond 
600 or 920, respectively, in analogy with the situation in Fig. \E\ 



where the {tf^ are now considered as realisations of (T^ )^^\ We can thus estimate the squared 
relative error as 

rjliP:: ,A)^j--^^ ^4. (38) 

For simple sampling, one proceeds in the obvious analogous way (without tilting and reweighting) . 

In line with the limit discussed in Sec. I4.3[ we now consider Gn{z^^^) for system sizes n = rii, 
where Ui = nf^ + n-[^^ < i < 10, and we choose n'f^ = im^°'\ a £ {c, v, /}, for 1 < « < 10, 

as well as n'^'^ — m^'^^ /2, tiq"' = rrS'"^ /2, and n\^^ = rrSf^ (i.e., we simply 'multiply' the system, 
except for i = 0, which corresponds to 'half a system except for the foreign peptide, which cannot 
be split into two). We then simulate ¥{Gni{z^^'^) > ffact'^i/™) for two values of z^^'^ and a fixed 
value of jact with our importance sampling method, as shown in Fig. [7l 

Obviously, the (estimated) probabilities decay to zero at an exponential rate with increasing 
n, as they must by their LDP. In contrast, the (estimated) squared RE only increases linearly 
- this even goes beyond the prediction of the theory (asymptotic efficiency only guarantees a 
subexponential increase) . 

So far, we have considered the n-dependence of the method for a fixed value of ^act, in the light 
of the available asymptotic theory. For the practical simulation of the given T-cell problem, we 
now take the given system size n = m and numerically investigate the relative error as a function 
of (7act- Here, the exponential decay of ¥{G{z^^'>) > ^act) as a function of g^ct is decisive, which 
we have already observed in Fig. O and which goes together with the at-least-linear increase of / 
with 5act (recall that / is convex, and see Fig. [4]). Fig. [8] shows the relative error of both SS and 
IS. It does not come as a surprise that, again, IS does extremely well and beats the exponential 
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Fig. 7 Importance sampling simulations for f'iGn^z'^^'') > gactn/m) for n = rii, < i < 10, for gact ~ 400 

and two values of a'^^ Left: estimate of the probability (note that the vertical axis is on logarithmic scale). 
Right: estimated squared RE. 



decay of the probabilities: whereas, on the log scale of the vertical axis, the squared RE of SS 
grows roughly linearly, it remains more or less constant for IS. (The very low squared RE of the 
simple sampling graphs for low thresholds in the right panel is due to the fact that the probability 
to reach this threshold is quite high and the huge sample of = 1.3 ■ 10^ contributes to estimating 
it, that is, the sample sizes are not comparable. A simple sampling simulation run with the total 
sample size of a corresponding IS simulation (i.e., N = 10000 times the number of steps contained 
in the interval considered) results in higher relative errors than for importance sampling even for 
the low threshold values (left panel). We would like to note, however, that the runtime of simple 
sampling for these small sample sizes is shorter than the runtime for IS, even if one does not count 
the overhead required to get the tilting parameters for importance sampling.) 



IS-RE for z<-f> = 1000 — 1 

IS-RE for z<fl = 2000 -'< - 

SS-RE for z<fl = 1000 o.oi - 




Fig. 8 Estimated squared RE for simple sampling (A^ = 10000 times the number of steps contained in 
the considered interval (left), A'^ = 1.3 * 10® (right)), and importance sampling (A^ = 10000 per threshold 
value in either panel) simulations of P(G'(z'^^) > (/act) of the basic model, Eq. Note that the vertical 
axis is on logarithmic scale. 



Figure [9] sheds more light on the behaviour of the relative error of the IS simulation. It shows 
the squared RE for 6 distinct ^^■''-'-values and reveals the finite-size effects. The wave- like behaviour 
for larger z^-^^ is due to the fact that, for very low threshold values, there is no real need for tilting, 
because the original distribution P„ is already close to optimal and the tilting factor is very small. 
For increasing thresholds, substantial tilting is required, but there are still visible deviations from 
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Fig. 9 Estimated squared RE of our IS estimate, for various frequencies z'-'^' of the foreign antigen. Details 
are as in Fig. [8l but now the vertical axis is on linear scale. 

the n ^ oo limit (as already discussed in the context of Fig. [5]), so the tilted distributions are not 
optimal. This produces the hump in the squared RE curves, which is more pronounced for larger 
z^^^ values because, for the case n = m considered here, the foreign antigens come as a single term 
that may stand out. For large gact, finally, one gets close enough to the Hmit, and the expected 
sub-exponential increase sets in (in our case, it is, in fact, roughly Hnear). Nevertheless, it should 
be clear that, in spite of the slight non-optimality at small threshold values, our tilted distributions 
still yield a far lower squared RE than does simple sampling. A very similar picture emerges for the 
extended model; surprisingly, the relative error is no larger than in the basic model, although the 
ad hoc simulation method used here is not asymptotically efficient (see Sec. l4.4j data not shown). 

5.2 Analysis of the T-cell model 

In this Section, we use our simulation method to obtain more detailed insight into the phenomenon 
of statistical recognition in the T-cell model. As discussed before, the task is to discriminate one 
foreign antigen type against a noisy background of a large number of self antigens. We already 
know from Fig.[5]that, for threshold values that are not too small, the activation probability in the 
presence of foreign antigens is several orders of magnitude larger than the activation probability 
of the self-background, i.e. Eq. |[6| is satisfied. As discussed in [37], this distinction relies on 
zif) > z^'^\z^'"^ - what happens is that larger copy numbers of the foreign antigen thicken the 
tail of the distribution of G{z^^^) (without changing its mean), so that the threshold is more 
easily surpassed. The self-nonself distinction may, according to this model, be roughly described 
as follows. For a given antigen (foreign or self), finding a highly-stimulating T-cell receptor is a 
rare event; but if it occurs to a foreign antigen, it occurs many times simultaneously since there 
are numerous copies, which all contribute the same large signal, since all receptors of the T-cell 
involved are identical; the resulting stimulation rate is thus high. In contrast, if it is a self antigen 
that finds a highly-stimulating receptor, the effect is less pronounced due to the smaller copy 
numbers. In this sense, the toy model explains the distinction solely on the basis of copy numbers; 
but see the Discussion for more sophisticated effects that alleviate this requirement. 

Following these intuitive arguments, we now aim at a more detailed picture of how the self 
background looks, and how the foreign type stands out against it. To investigate this, it is useful 
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23.1 


35.6 


88.8 


134.9 


191.3 



Table 1 Sa mple means (left) and sample standard deviations (right) of the histograms in Fig. 1101 fleft) 
and Fig. [11] (i.e., the self-only case). 
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90.4 
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18.5 
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54.5 


39.2 



Table 2 Sample means (left) and sample standard deviations (right) of the histograms in Fig. [10] (right) 
and Fig. [12] (i.e., the case with foreign antigens). 



to consider the histograms of the total constitutive, variable, and foreign stimulation rates, i.e., 
the contributions of the constitutive sum, the variable sum, and the individual foreign term in the 
sum ([3]), either for all samples or for the subset of samples for which G(z/) > ^act, for various gact. 
Since this requires a higher resolution (and thus larger sample size) than the calculation of the 
activation probabilities alone, such analysis would be practically impossible with simple sampling. 
With IS, we again generated 10000 samples per g^ct value, from which between 30 and 70 percent 
turned out to reach the threshold. 




100 200 300 100 200 300 

stimulation rate stimulation rate 



Fig. 10 Histograms of the total stimulation rates of variable, constitutive, and foreign antigens, for 
^(/) _ Q (left) and z^-^^ = 1000 (right), in the basic model ([S]), when all samples are included. Sample size 
is 10000, and the vertical axis holds the number of samples whose total constitutive (variable, foreign) 
stimulation rates fall into given intervals. Note that the scaling of the vertical axis varies across diagrams. 



Figure [TOl shows the resulting histograms when all samples are included, and Figs. [TT] and [T2l 
show the histograms for the subset of samples that have surpassed four representative threshold 
values, without and with foreign antigen. Tables (Hand [2] summarise these results in terms of means 
and standard deviations. Finally, Fig. [TSl shows the corresponding two-dimensional statistics for 
all pairs of variable, constitutive, and foreign stimulation rates, again for various threshold values. 
(Figs. [TTI^TSl are based on the outcome of importance sampling without reweighting; normalising 
by the number of "successful" samples would result in an estimate of the conditional distribution, 
because the reweighting factors cancel out.) 

Let us start with the situation without foreign antigens, as displayed in Figs. [TOj (left) and 
[TT] as well as Table [TJ This already illustrates the fundamental difference between variable and 
constitutive antigens. Judging from the large number (to'"^ = 1500) of individual terms in the 
sum at low copy number (z^"' = 50), the variable stimulation rate is expected to be approximately 
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Fig. 11 Histograms of the total stimulation rates of variable and constitutive antigens, for z^^' = 0, in 
the basic model ([Sj, for samples that reach a given threshold value ((?act = 100 (upper left), (/act = 250 
(upper right), Qact — 500 (lower left), i^act — 1000 (lower right)). Sample size is 10000, and the vertical axis 
holds the number of samples that reach yact and whose total constitutive (variable, foreign) stimulation 
rates falls into given intervals. Note that the scaling of both axes varies across diagrams. 



normally distributed and fairly closely peaked around its mean - at least as long as no restriction on 
G{z^^^) is involved - and, as the Figure shows, this feature persists when G{z^^^) > (/act, practically 
independently of the threshold involved. So, the variable antigens form a kind of background that 
poses no difficulty to foreign-self distinction: it is not very noisy, and it does not change with the 
threshold. 

In contrast, the distribution of the constitutive activation rates is wider; this is due to the 
large copy numbers {z^'^^ = 500), the effect of which is not compensated by the smaller number of 
terms, m'^'^^ = 50. Furthermore, the normal approximation is not expected to be particularly good 
for the constitutive antigens - given the extreme asymmetry of the M^-distribution (see Fig. [3]) , 
the central Umit theorem will not average out the deviations at only m''^' = 50. In particular, 
the distribution remains asymmetric. With increasing threshold, this distribution moves to the 
right. The reason for this is that, in order to reach an increasing ^act, the tail events of the 
constitutive or the variable sum or both must be used, but it is "easier" (that is, more probable) 
to use the constitutive one because it contains more atypical events. In the language of large 
deviation theory, this is an example of the general principle that "large deviations are always 
done in the the least unlikely of all the unlikely ways" [HI Ch. I]. In the language of biology, the 
constitutive antigens are the problem of foreign-self distinction: due to their high copy numbers 
and incomplete averaging, fluctuations persist that occasionally induce an immune response even 
in the absence of foreign antigens. This occurs if a T-cell receptor happens to fit particularly well 
to one, or a number of, constitutive antigen types on an APC; due to their large copy numbers, 
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stimulation rate stimulation rate 

Fig. 12 Histograms of the total constitutive, variable and foreign stimulation rates for = 1000 in the 
basic model ([3|. Sample size is 10000, and the vertical axis holds the number of samples that reach the 
threshold pact and whose total constitutive (variable, foreign) stimulation rate falls into a given interval, 
for Qact — 100 (upper left), gact — 250 (upper right), gact = 500 (lower left), gact = 1000 (lower right). The 
maximal stimulation rate for the foreign antigens is 2:'-'^iu(l) — 367.9. Note that the scaling of both axes 
varies across diagrams. 



these few highly-stimulating types are then sufficient to surpass the threshold (in contrast, several 
highly-stimulating types would be required for the variable antigens to elicit a reaction, which is 
too improbable). 

Let us now turn to the picture with foreign antigen present (Figs. [lO] (right), [l2l SHI and 
Table E]). One salient feature here is that the variable stimulation rate behaves exactly as in the 
self-only case: closely peaked around a small mean, unchanged when {G{z^^'>) > gact} is imposed. 
The picture is thus dominated by the interplay of constitutive and foreign types. In line with Fig.[H 
the situation is similar in the case without restriction on G{z^^^) (Fig.[ini right) and the case when 
G(z(^)) > 100 (Fig. [la upper left). In particular, the foreign stimulation rate is closely peaked 
at 0; only the constitutive background has moved slightly to the right, exactly as in the self-only 
case. For g^ct = 250 (Fig. [121 upper right), where, according to Fig.[5l foreign-self distinction sets 
in, the foreign stimulation rate becomes prominent: the right branch of the M^-distribution now 
becomes populated, and the associated stimulation rates are large due to the large copy numbers 
involved. 

Nevertheless, for (/act = 250, the foreign stimulation rate is close to in a sizable fraction of the 
cases in which an immune reaction occurs - here, the reaction is brought about by the constitutive 
background, which moves to the right just as in the self-only case (but less pronounced). Fig. 
[131 shows that the constitutive and foreign stimulation rates are, indeed, negatively correlated: 
as is to be expected, low foreign rates are compensated by high constitutive rates and vice versa 
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(in contrast, the variable background hardly correlates with either the constitutive or the foreign 
stimulation rate). As in the self-only case, therefore, the level of unwanted activation ("self-only" or 
"mainly self, without appreciable foreign activation") is set by the tail behaviour of the constitutive 
background. However, if (/act is increased further (Fig. [I2l lower left), every T cell beyond the 
threshold displays high stimuli for the foreign antigen, their distribution shifting even further to 
the right and concentrating near the maximal stimulation rate given by the maximum of the 
function w of Eq. (U), more precisely, by z^^^'w{l). This maximum can, of course, not change by 
imposing restrictions on G{z'-^^); thus, any further increase of (/act (Fig. [I2l lower right) must 
then be matched by the by now familiar shift of the constitutive background. (This last panel is, 
however, less biologically realistic since the probabilities involved are too small to be relevant - 
after all, with about 10^ different T-cell types, threshold values that yield activation probabilities 
far below 10~^ even in the presence of foreign antigens offer no immune protection.) 

A further illustration of the onset of self-nonself distinction is presented in Fig. [141 Here we 
consider 

p(G(.(/)) - .(/)M.„..,„..,, > .a. I Gizui) > ,..0 . 

\r[^Li-[zyj ' ) > ffactj 

(39) 

i.e., the probability that, in a T-cell that is activated in the presence of foreign antigen, the self 
component alone would have been sufficient for the activation. From z^^^ = 1000 onwards, this 
probability decreases to quickly with increasing g^ct- Put differently, in large parameter regions, 
the foreign antigens do indeed make the difference, which is the decisive feature of self-nonself 
distinction. 



6 Conclusion and outlook 

We have established here a method of LD sampling that allows the convenient simulation of 
the rare events relevant to statistical recognition in the immune system. Thus a more thorough 
investigation of these events could be carried out. 

But this is only a first step, and the goal for future work is to use this or related methods 
to investigate biologically reahstic models. Indeed, the toy model considered here, which relies 
solely on distinction by copy numbers, does serve the aim to illustrate that distinction against a 
noisy background is, at all, possible, even without an intrinsic difference between self and nonself, 
and how this is related to the rare events in the tail of the background distribution. However, 
biologically realistic models have to take into account tolerisation mechanisms that make the T- 
cells less responsive to self antigens. One important such mechanism is so-called negative selection. 
Negative selection occurs during the maturation phase of young T-cells in the thymus, before 
they are released into the body. In a process similar to the one described by the toy model, 
they are confronted with APCs that present mixtures of various self antigens, and those T-cells 
whose activation rate surpasses a thymic activation threshold gthy < <?act are eliminated. When 
they are later, after leaving the thymus, confronted with mixtures of self and foreign antigens, the 
stimulation rates emerging from self and foreign are no longer i.i.d. (the self ones are biased towards 
smaller values and possibly negatively correlated). In fact, a simple model for negative selection 
was already described in BRB [33], and shown to drastically reduce the self background, so that 
foreign antigens do no longer require elevated copy numbers to be detected. More sophisticated 
models of negative selection have been formulated e.g. in [32]. However, their simulation still awaits 
the development of adequate methods. This is the purpose of ongoing work. 
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Fig. 13 Pairwise joint frequencies of the total constitutive, variable, and foreign stimulation rates, for 
those samples with G{z^^^) > pact in the basic model ([Sj (with z^-^^ = 1000). Greyscales correspond to 
number of samples falling into 2D-intervals defined by total stimulation rates of pairs of antigen types. 
Rows (from top to bottom): gact = 100, 250, 350, 500, 750, 1000; columns (from left to right): constitutive 
(horizontal) - variable (vertical); foreign (horizontal) - variable (vertical); foreign (horizontal) - constitu- 
tive (vertical). Lighter shading corresponds to higher frequencies. 
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Fig. 14 Fraction of samples whose self-component alone is above threshold, among those that reach the 
threshold in the presence of z'-^' foreign molecules, for various z'-^' (i.e., IS simulation of the probability 
in Eq. I|39p ). Sample size is 10000 for each gact value considered. 



28 



References 

1. Arstila, T., Casrouge, A., Baron, V., Even, J., Kannelopoulos, J., Kourilsky, P.: A direct estimate of 
the human a/? T cell receptor diversity. Science 286, 958-961 (1999). 

2. Asmussen, S.: Applied Probability and Queues. 2nd ed., Springer, New York (2003). 

3. Billingsley, P.: Probability and Measure. 3rd ed., Wiley, New York (1995). 

4. Borovsky, Z., Mishan-Eisenberg, G., Yaniv, E., Rachmilewitz, J.: Serial triggering of T cell receptors 
results in incremental accumulation of signaling intermediates. J. Biol. Chem. 277, 21529-21536 
(2002). 

5. Bucklew, J. A.: Introduction to Rare Event Simulation. Springer, New York (2004). 

6. Davis, S.J., Ikemizu, S., Evans, E.J., Fugger, L., Bakker, T.R., van der Merwe, P.A.: The nature of 
molecular recognition by T cells. Nat. Immunol. 4, 217-224 (2003). 

7. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Springer, New York (1998). 

8. den Hollander, F.: Large Deviations. AMS, Providence, RI (2000). 

9. Dieker, A., Mandjes, M.: On asymptotically efficient simulation of large deviation probabilities. 
Adv. Appl. Prob. 37, 539-552 (2005). 

10. Dushek, O., Coombs, D.: Analysis of serial engagement and peptide-MHC transport in T cell receptor 
microclusters. Biophys. J. 94, 3447-3460 (2008). 

11. Georgii, H.O.: Stochastics. de Gruyter, Berhn (2008). 

12. Gonzalez, P.A., Carreno, L.J., Coombs, D., Mora, J.E., Palmieri, E., Goldstein, B., Nathenson, S.G., 
Kalergis, A.M.: T-cell receptor binding kinetics required for T cell activation depend on the density of 
cognate ligand on the antigen-presenting cell. Proc. Natl. Acad. Sci. U.S. A 102, 4824-4829 (2005). 

13. Hlavacek, W.S., Redondo, A., Wofsy, C, Goldstein, B.: Kinetic proofreading in receptor-mediated 
transduction of cellular signals: receptor aggregation, partially activated receptors, and cytosolic mes- 
sengers. Bull. Math. Biol. 64, 887-911 (2002). 

14. Hunt, D. F., Henderson, R.A., Shabanowitz, J., Sakaguchi, K., Michel, H., Sevilir, N., Cox, A.L., 
Appella, E., Engelhard, V.H.: Characterization of peptides bound to the class I MHC molecule HLA- 
A2.1 by mass spectrometry. Science 255 (1992), 1261-1263. 

15. Kalergis, A.M., Boucheron, N., Doucey, M.A., Palmieri, E., Goyarts, E.C., Vegh, Z., Luescher, I.F., 
Nathenson, S.G.: Efficient T cell activation requires an optimal dwell-time of interaction between the 
TCR and the pMHC complex. Nat. Immunol. 2, 229-234 (2001). 

16. Kronmal, R.A., Peterson, A. J.: On the alias method for generating random variables from a discrete 
distribution. Amer. Stat. 33, 214-218 (1979). 

17. Lancet, D., Sadovsky, E., Seidelmann, E.: Probability model for molecular recognition in biological 
receptor repertoires: Significance to the olfactory system. Proc. Natl. Acad. Sci. U.S.A. 90, 3715-3719 
(1993). 

18. Lord, G.M., Lechler, R.I., George, A., I.: A kinetic differentiation model for the action of altered TCR 
ligands. Immunol. Today 20, 33-39 (1999). 

19. Madras, N.: Lectures on Monte-Carlo Methods. AMS, Providence, RI (2002). 

20. Mason, D.: A very high level of crossreactivity is an essential feature of the T-cell receptor. Im- 
munol. Today 19, 395-404 (1998). 

21. McKeithan, T.W.: Kinetic proofreading in T-cell receptor signal transduction. 
Proc. Natl. Acad. Sci. U.S.A. 92, 5042-5046 (1995). 

22. Rabinowitz, J.D., Beeson, C, Wulfing, C, Tate, K., Allen, P.M., Davis, M.M., McConnell, H.M.: 
Altered T-cell receptor ligands trigger a subset of early T cell signals. Immunity 5, 125-135 (1996). 

23. Rosenwald, S., Kafri, R., Lancet, D.: Test of a statistical model for molecular recognition in biological 
repertoires. J. Theor. Biol. 216, 327-336 (2002). 

24. Ross, S.M.: Simulation. Academic Press (2002). 

25. Rothenberg, E.V.: How T-cells count. Science 273, 78-80 (1996). 

26. Sadowsky, J.S., Bucklew, J. A.: On large deviations theory and asymptotically efficient Monte Carlo 
estimation. IEEE TIT 36, 579-588 (1990). 

27. Sousa, J., Carneiro, J.: A mathematical analysis of TCR serial triggering and down-regulation. 
Eur. J. Immunol. 30, 3219-3227 (2000). 

28. Stevanovic, S., Schild, H.: Quantitative aspects of T cell activation - peptide generation and editing 
by MHC class I molecule. Seminars Immunol. 11 (1999), 375-384. 

29. Utzny, C, Coombs, D., Muller, S., Valitutti, S.: Analysis of peptide/MHC-induced TCR downregula- 
tion: deciphering the triggering kinetics. Cell Biochem. Biophys. 46, 101-111 (2006). 

30. Valitutti, S., Lanzavecchia, A.: Serial triggering of TCRs: a basis for the sensitivity and specificity of 
antigen recognition. Immunol. Today 18, 299-304 (1997). 

31. Valitutti, S., Muller, S., Cella, M., Padovan, E., Lanzavecchia, A.: Serial triggering of many T-cell 
receptors by a few peptide-MHC complexes. Nature 375, 148-151 (1995). 

32. van den Berg, H.A., Molina-Paris, C: Thymic presentation of autoantigens and the efficiency of 
negative selection. J. Theor. Med. 5, 1-22 (2003). 

33. van den Berg, H.A., Rand, D.A., IBurroughs, N.J.: A reliable and safe T-cell repertoire based on 
low-affinity T-cell receptors. J. Theor. Biol. 209, 465-486 (2001). 

34. van den Berg, H.A., Rand, D.A.: Antigen presentation on MHC molecules as a diversity filter that 
enhances immune efficacy. J. Theor. Biol. 224, 249-267 (2003). 



29 



35. van den Berg, H.A., Rand, D.A.: Quantitative theory of T-cell responsiveness. Immunol. Rev. 216, 
81-92 (2007). 

36. Viola, A., Lanzavecchia, A.: T-cell activation determined by T-cell receptor number and tunable 
thresholds. Science 273, 104-106 (1996). 

37. Zint, N., Baake, E., den Hollander, F.: How T-cells use large deviations to recognize foreign antigens. 
J. Math. Biol. 57, 841-861 (2008). 



